Efficient modification of LZSS compression algorithm
Author
Abstract
This paper presents a new method of lossless data compression called LZPP, an advanced modification of the well-known LZSS algorithm [1]. It introduces improvements to the LZ family of algorithms [2, 3], such as special coding of two- and three-byte matches, an auxiliary entropy coder, and new criteria for symbol exclusion. Minimizing the compression ratio, measured in bits per character (bpc), was chosen as the primary goal of the proposed modifications to LZSS.

1. Analysis of LZSS algorithm and proposed modifications

1.1. Index coding

The first thing to consider when trying to improve compression quality is the indexes generated by every LZ method. Indexes can be coded with an adaptive statistical compression method, such as arithmetic or Huffman coding. This is supported by the sample probability distribution of indexes (calculated as the distance from the end of the window) shown in Figure 1, which the LZSS method generates when compressing the "xplik" file (all files of the Calgary Corpus merged together). The graphs for most of the files used in the experiments had similar shapes, so it can be assumed that indexes generally follow such a distribution. Moreover, symbols with such a distribution should compress well with a statistical coder, because the entropy (see [4]) equals H(S) = 8.954 bits in this case. This means that ideal static entropy coding would reduce the number of bits needed to store indexes by as much as 44%. Based on this fact, the LZPP method uses a fast variant of arithmetic coding (RangeCoder [5]) for index coding. The range coder was additionally equipped with a rotating buffer, handling of an escape marker, and an exclusion mechanism.
Additionally, an independent coding of index bytes ...
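The entropy argument above can be illustrated with a small sketch. It is not the LZPP implementation: the greedy parser, the function names, and the parameters `window`, `min_len` and `max_len` are hypothetical choices made for illustration. The 16-bit raw index width is also an assumption; it is not stated in the text, but it is consistent with the quoted figures, since 1 - 8.954/16 is roughly 0.44.

```python
import math
from collections import Counter


def lzss_distances(data: bytes, window: int = 4096,
                   min_len: int = 3, max_len: int = 18) -> list:
    """Greedy LZSS-style parse that returns only the match distances.

    Distances are measured back from the current position, i.e. from the
    end of the sliding window. All parameters are illustrative assumptions.
    """
    distances = []
    pos = 0
    while pos < len(data):
        best_len, best_dist = 0, 0
        for cand in range(max(0, pos - window), pos):
            length = 0
            while (length < max_len and pos + length < len(data)
                   and data[cand + length] == data[pos + length]):
                length += 1
            # ">=" keeps the nearest candidate among equally long matches
            if length > 0 and length >= best_len:
                best_len, best_dist = length, pos - cand
        if best_len >= min_len:
            distances.append(best_dist)  # a match token: record its index
            pos += best_len
        else:
            pos += 1                     # a literal: no index is produced
    return distances


def entropy_bits(symbols) -> float:
    """Zeroth-order Shannon entropy in bits per symbol."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def saving(entropy: float, raw_bits: int) -> float:
    """Fraction of bits an ideal static entropy coder would save."""
    return 1.0 - entropy / raw_bits


if __name__ == "__main__":
    sample = b"abracadabra abracadabra abracadabra"
    dists = lzss_distances(sample)
    print("index entropy:", entropy_bits(dists), "bits")
    # Assumed 16-bit raw index, using the entropy value quoted in the text
    print("saving at H = 8.954:", round(saving(8.954, 16), 3))
```

On a real corpus the distances would be collected over the whole input and their entropy compared with the fixed index width, which is the comparison the 44% figure expresses.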
Similar resources
The relative efficiency of data compression by LZW and LZSS
We explore the use of the LZW and LZSS data compression methods. These methods, or variants of them, are commonly used to compress different types of data. Even though LZSS gives better compression results on average, we determine the cases in which LZW performs best and when the compression efficiency gap between LZW and its LZSS counterpart is the largest.
Optimizing LZSS compression on GPGPUs
In this paper, we present an algorithm and provide design improvements needed to port the serial Lempel–Ziv–Storer–Szymanski (LZSS) lossless data compression algorithm to a parallelized version suitable for general purpose graphics processing units (GPGPU), specifically for NVIDIA’s CUDA Framework. The two main stages of the algorithm, substring matching and encoding, are studied in detail to fit...
The Block Lossless Data Compression Algorithm
The mainstream lossless data compression algorithms have been extensively studied in recent years. However, rather less attention has been paid to the block algorithm of those algorithms. The aim of this study was therefore to investigate the block performance of those methods. The main idea of this paper is to break the input into different sized blocks, compress separately, and compare the re...
A New Compression Method for Compressed Matching
A practical adaptive compression algorithm based on LZSS is presented, which is especially constructed to solve the compressed pattern matching problem, i.e., pattern matching directly in a compressed text without decompressing.
Delta Encoding in a Compressed Domain
A delta compression algorithm is presented, working on an LZSS compressed reference file and an uncompressed version, and producing a delta file that can be used to reconstruct the version file directly in its compressed form. This has applications to accelerate data flow in network environments.
Journal title:
- Annales UMCS, Informatica
Volume 1, Issue
Pages -
Publication date: 2003